Scientific registers and disciplinary diversification: a comparable corpus approach
Authors
Abstract
We present a study of linguistic contrast and commonality in English scientific discourse, based on a monolingually comparable corpus. The focus is on selected scientific disciplines at the boundary with computer science (computational linguistics, bioinformatics, digital construction, microelectronics). The data come from the English Scientific Text Corpus (SCITEX), which covers a time range of roughly thirty years (1970/80s to early 2000s). In particular, we investigate the disciplinary diversification and relatedness of scientific research articles in terms of register. Our results are relevant for research on multilingually comparable corpora as used in machine translation and related research, since they shed new light on the notion of ‘comparability’.
Similar resources
Data Mining with Shallow vs. Linguistic Features to Study Diversification of Scientific Registers
We present a methodology to analyze the linguistic evolution of scientific registers with data mining techniques, comparing the insights gained from shallow vs. linguistic features. The focus is on selected scientific disciplines at the boundaries to computer science (computational linguistics, bioinformatics, digital construction, microelectronics). The data basis is the English Scientific Tex...
ACADEMIC WRITING REVISITED: A PHRASEOLOGICAL ANALYSIS OF APPLIED LINGUISTICS HIGH-STAKE GENRES FROM THE PERSPECTIVE OF LEXICAL BUNDLES
Lexical bundles are frequent word combinations that commonly appear in different registers. They have been the subject of much research in the area of corpus linguistics during the last decade. While most previous studies of bundles have mainly focused on variations in the use of these word combinations across different registers and a number of disciplines, not much research has been done to e...
Feature Discovery for Diachronic Register Analysis: a Semi-Automatic Approach
In this paper, we present corpus-based procedures to semi-automatically discover features relevant for the study of recent language change in scientific registers. First, linguistic features potentially adherent to recent language change are extracted from the SciTex Corpus. Second, features are assessed for their relevance for the study of recent language change in scientific registers by mean...
DyVSoR: dynamic malware detection based on extracting patterns from value sets of registers
To control the exponential growth of malware files, security analysts pursue dynamic approaches that automatically identify and analyze malicious software samples. Obfuscation and polymorphism employed by malware make it difficult for signature-based systems to detect sophisticated malware files. Dynamic analysis of run-time behavior provides a better technique to identify the threat. In t...
Metadiscourse in Applied Linguistics and Chemistry Research Article Introductions
This study examined disciplinary rhetoric in research articles, focusing on different traditions in structuring text discourses from a metadiscourse-move analytic approach. The corpus consisted of 72 research article Introductions (RAIs): 36 in applied linguistics and 36 in chemistry. Swales’ CARS model (1990, 2004) and Hyland’s interpersonal model of metadiscourse (2005) were used as analytica...